Multi Domain Language Model Adaptation using Explicit Semantic Analysis
Authors
Abstract
This paper presents an adaptive multi-domain language model built from large sources of pre-existing, human-created structured data. The structure of these sources is exploited to create a large array of n-gram language models, which are dynamically interpolated at decoding time to produce a context-dependent language model that continuously adapts to the current domain. Because human annotation is expensive and impractical, we explore existing sources of human-created structured data and how to extract the desired data from them. The language model is evaluated by its performance in a speech recognition system used to decode the Quaero 2009 evaluation data set. Compared to the baseline language model of our Quaero 2009 evaluation system, the proposed adaptive language model reduces the word error rate (WER) of the speech recognition system by 0.5% absolute, with reductions of up to 14.4% on some shows.
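The dynamic interpolation described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the authors' method: the models are add-alpha smoothed unigrams instead of full n-gram models, the two toy domains and their texts are invented, and the weighting function `interpolation_weights` is a hypothetical stand-in that scores each domain by the cosine similarity between the recent word history and the domain's word counts, loosely in the spirit of Explicit Semantic Analysis.

```python
from collections import Counter
import math


def train_unigram(tokens, vocab, alpha=1.0):
    """Add-alpha smoothed unigram model over a fixed vocabulary."""
    counts = Counter(tokens)
    total = len(tokens) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def interpolation_weights(history, domain_texts, floor=1e-3):
    """Hypothetical weighting: similarity of the recent history to each
    domain, floored so no domain gets zero weight, then normalized."""
    hist = Counter(history)
    scores = {d: cosine(hist, Counter(toks)) + floor
              for d, toks in domain_texts.items()}
    z = sum(scores.values())
    return {d: s / z for d, s in scores.items()}


def interpolated_prob(word, weights, models):
    """Linear interpolation: P(w) = sum_d lambda_d * P_d(w)."""
    return sum(weights[d] * models[d][word] for d in models)


# Toy domain corpora (invented for this sketch).
domain_texts = {
    "sports": "the team won the match and the team scored".split(),
    "finance": "the bank raised the rate and the market fell".split(),
}
vocab = sorted({w for toks in domain_texts.values() for w in toks})
models = {d: train_unigram(toks, vocab) for d, toks in domain_texts.items()}

# Weights are recomputed from the current context at decoding time.
history = "the team scored".split()
weights = interpolation_weights(history, domain_texts)
p_match = interpolated_prob("match", weights, models)
p_bank = interpolated_prob("bank", weights, models)
```

With a sports-like history, the sports model receives the larger weight, so a sports word such as "match" becomes more probable than a finance word such as "bank". Because the weights sum to one and each component model is a proper distribution, the interpolated model is a proper distribution as well.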
Similar resources
A DSL for Explicit Semantic Adaptation
In the domain of heterogeneous model composition, semantic adaptation is the “glue” that is necessary to assemble heterogeneous models so that the resulting composed model has well-defined semantics. In this paper, we present an execution model for a semantic adaptation interface between heterogeneous models. We introduce a Domain-Specific Language (DSL) for specifying such an interface explici...
Public Transport Ontology for Passenger Information Retrieval
Passenger information aims at improving the user-friendliness of public transport systems while influencing passenger route choices to satisfy transit user’s travel requirements. The integration of transit information from multiple agencies is a major challenge in implementation of multi-modal passenger information systems. The problem of information sharing is further compounded by the multi-l...
Rapid Unsupervised Topic Adaptation – a Latent Semantic Approach
In open-domain language exploitation applications, a wide variety of topics with swift topic shifts has to be captured. Consequently, it is crucial to rapidly adapt all language components of a spoken language system. This thesis addresses unsupervised topic adaptation in both monolingual and crosslingual settings. For automatic speech recognition we rapidly adapt a language model on a source l...
Recurrent neural network language model adaptation for multi-genre broadcast speech recognition
Recurrent neural network language models (RNNLMs) have recently become increasingly popular for many applications including speech recognition. In previous research RNNLMs have normally been trained on well-matched in-domain data. The adaptation of RNNLMs remains an open research area to be explored. In this paper, genre and topic based RNNLM adaptation techniques are investigated for a multi-g...
Authoring Semantic and Linguistic Knowledge for the Dynamic Generation of Personalized Descriptions
We present the ELEON/NATURALOWL system, an application of Semantic Web and Natural Language Generation technologies that combines a conceptual representation of cultural heritage objects with linguistic and adaptation resources. This combined model is used to automatically generate multi-lingual and personalized textual descriptions of cultural heritage objects represented as instances of an OW...
Publication date: 2011